Classification method for traditional Chinese medicine electronic medical records based on heterogeneous graph representation
Kaitian WANG, Qing YE, Chunlei CHENG
Journal of Computer Applications, 2024, 44(2): 411-417. DOI: 10.11772/j.issn.1001-9081.2023030260

Traditional Chinese Medicine (TCM) electronic medical records are difficult to mine, are underutilized, and resist meaningful information extraction because of their complex and diverse structures and their non-standard diagnosis and treatment terminology. To address these issues, a TCM electronic medical record classification model called TCM-GCN was proposed. It combines the Linguistically-motivated bidirectional Encoder Representation from Transformer (LERT) pre-trained model with a Graph Convolutional Network (GCN) and represents the records as a heterogeneous graph, with the aim of improving the extraction and classification of effective features in TCM electronic medical records. Firstly, the medical records were converted into sentence vectors by the embedding method of the LERT layer and integrated into the heterogeneous graph to supply the overall semantic features that the graph structure lacks. Next, to mitigate the negative impact of the records' structural characteristics on feature extraction, keywords were added as nodes of the heterogeneous graph, and the BM25 and Pointwise Mutual Information (PMI) algorithms were employed to construct edges representing the features of medical records, such as “medical record - keyword” and “keyword - keyword”. Finally, TCM-GCN completed the classification task by aggregating and extracting the feature relationships between medical records over the heterogeneous graph built with LERT-BM25-PMI. Experimental results on the TCM electronic medical record dataset show that, compared with LERT, the second-best model, TCM-GCN improves the weighted-average accuracy, recall, and F1 value by 2.24%, 2.38%, and 2.32%, respectively, which confirms the effectiveness of the algorithm in capturing hidden features in medical records and classifying TCM electronic medical records.
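
The edge-construction step can be illustrated with a minimal sketch. The abstract does not give the exact formulas, parameters, or tokenization, so everything below is an assumption: the standard Okapi BM25 weight is used for "medical record - keyword" edges, a sliding-window PMI that keeps only positive values is used for "keyword - keyword" edges, and the function names, the parameters `k1`, `b`, and `window`, and the toy English tokens are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def bm25_edges(docs, keywords, k1=1.5, b=0.75):
    """Weight (record index, keyword) edges with the Okapi BM25 score.

    docs is a list of token lists; keywords is a set of tokens.
    k1 and b are common default values, not values from the paper.
    """
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency of each keyword.
    df = Counter(k for d in docs for k in set(d) if k in keywords)
    edges = {}
    for i, doc in enumerate(docs):
        tf = Counter(doc)
        for kw in keywords:
            f = tf.get(kw, 0)
            if f == 0:
                continue  # no edge when the keyword is absent from the record
            idf = math.log((n - df[kw] + 0.5) / (df[kw] + 0.5) + 1.0)
            norm = f + k1 * (1 - b + b * len(doc) / avg_len)
            edges[(i, kw)] = idf * f * (k1 + 1) / norm
    return edges


def pmi_edges(docs, keywords, window=3):
    """Weight (keyword, keyword) edges with PMI over sliding windows.

    Only pairs with positive PMI are kept (an assumed convention).
    """
    windows = []
    for doc in docs:
        if len(doc) <= window:
            windows.append(doc)
        else:
            windows.extend(doc[j:j + window] for j in range(len(doc) - window + 1))
    total = len(windows)
    single, pair = Counter(), Counter()
    for w in windows:
        present = sorted(set(w) & keywords)
        single.update(present)
        pair.update(combinations(present, 2))
    edges = {}
    for (a, c), n_ac in pair.items():
        # PMI = log( p(a, c) / (p(a) * p(c)) ), with window-count probabilities.
        pmi = math.log(n_ac * total / (single[a] * single[c]))
        if pmi > 0:
            edges[(a, c)] = pmi
    return edges


# Toy records with placeholder tokens (not data from the paper).
docs = [
    ["cough", "phlegm", "fever"],
    ["cough", "phlegm", "fatigue"],
    ["insomnia", "fatigue", "dizziness"],
]
keywords = {"cough", "phlegm", "fatigue", "insomnia"}

bm = bm25_edges(docs, keywords)
pmi = pmi_edges(docs, keywords)
```

In this sketch a rarer keyword receives a larger BM25 weight than a common one in the same record, and keyword pairs that co-occur less often than chance would predict get no edge. The resulting weights would populate the adjacency matrix that a GCN aggregates over.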
